A Study of Typing-Related Bugs in JVM Compilers
Compiler testing is a prevalent research topic that has gained much attention in the past decade. Researchers have mainly focused on detecting compiler crashes and miscompilations caused by bugs in the implementation of compiler optimizations. Surprisingly, this growing body of work neglects other compiler components, most notably the front end. In statically typed programming languages with rich and expressive type systems and modern features, such as type inference or a mix of object-oriented and functional programming features, the process of static typing in compiler front ends is complicated by a high density of bugs. Such bugs can lead to the acceptance of incorrect programs, the rejection of correct programs, and the reporting of misleading errors and warnings.
In this thesis, we undertake what is, to the best of our knowledge, the first effort to empirically investigate and characterize typing-related compiler bugs. To do so, we manually study 320 typing-related bugs (along with their fixes and test cases) that are randomly sampled from four mainstream JVM languages, namely Java, Scala, Kotlin, and Groovy. We evaluate each bug in terms of several aspects, including its symptom, its root cause, the size of its fix, and the characteristics of the bug-revealing test cases.
Finally, we implement a tool that exploits the findings of our thesis to find front-end compiler bugs in the Groovy and Kotlin compilers.
The Blockchain Imitation Game
The use of blockchains for automated and adversarial trading has become
commonplace. However, due to the transparent nature of blockchains, an
adversary is able to observe any pending, not-yet-mined transactions, along
with their execution logic. This transparency further enables a new type of
adversary, which copies and front-runs profitable pending transactions in
real-time, yielding significant financial gains.
Shedding light on such "copy-paste" malpractice, this paper introduces the
Blockchain Imitation Game and proposes a generalized imitation attack
methodology called Ape. Leveraging dynamic program analysis techniques, Ape
supports the automatic synthesis of adversarial smart contracts. Over a
timeframe of one year (1st of August, 2021 to 31st of July, 2022), Ape could
have yielded 148.96M USD in profit on Ethereum, and 42.70M USD on BNB Smart
Chain (BSC).
Beyond its use as a malicious attack, we further show the potential of transaction
and contract imitation as a defensive strategy. Within one year, we find that
Ape could have successfully imitated 13 and 22 known Decentralized Finance
(DeFi) attacks on Ethereum and BSC, respectively. Our findings suggest that
blockchain validators can imitate attacks in real-time to prevent intrusions in
DeFi.
On how zero-knowledge proof blockchain mixers improve, and worsen user privacy
Zero-knowledge proof (ZKP) mixers operating on top of smart contract-enabled
blockchains are among the most prominent and widely used blockchain privacy
solutions. ZKP mixers typically advertise their level of privacy through a
so-called anonymity set size, similar to k-anonymity, where a user hides among
a set of other users.
In reality, however, these anonymity set claims are mostly inaccurate, as we
find through empirical measurements of the currently most active ZKP mixers. We
propose five heuristics that, in combination, can increase the probability that
an adversary links a withdrawer to the correct depositor on average by 51.94%
(108.63%) on the most popular Ethereum (ETH) and Binance Smart Chain (BSC)
mixer, respectively. Our empirical evidence is hence also the first to suggest
a differing privacy-predilection of users on ETH and BSC. We further identify
105 Decentralized Finance (DeFi) attackers who use ZKP mixers as the source of
their initial funds and to deposit attack revenue (e.g., from phishing scams, hacking
centralized exchanges, and blockchain project attacks).
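The effect of a linking heuristic on an anonymity set can be sketched in a few lines. This is a minimal illustration rather than the paper's method: the function names, the toy deposit data, and the simplified address-reuse rule are all assumptions, and the percentage is computed as a relative increase in linking probability, which is one plausible reading of the figures above.

```python
def naive_anonymity_set(deposits, withdraw_block):
    """Advertised anonymity set: every address that deposited before the withdrawal."""
    return {address for address, block in deposits if block < withdraw_block}


def prune_reused_addresses(candidates, other_withdrawers):
    """Hypothetical address-reuse heuristic: a depositor whose address also shows up
    as the withdrawer of a different withdrawal is assumed to be linked to that
    withdrawal, so it is dropped from this withdrawal's candidate set."""
    return candidates - set(other_withdrawers)


def link_probability(candidates):
    """Chance of picking the right depositor by guessing uniformly among candidates."""
    return 1.0 / len(candidates) if candidates else 0.0


def relative_increase_pct(before, after):
    """Relative change of the linking probability, in percent."""
    return (after - before) / before * 100.0


# Toy data (made up): (depositor address, block number of the deposit).
deposits = [("0xA", 10), ("0xB", 11), ("0xC", 12), ("0xD", 13), ("0xE", 20)]

naive = naive_anonymity_set(deposits, withdraw_block=15)
pruned = prune_reused_addresses(naive, other_withdrawers=["0xB"])

p_before = link_probability(naive)
p_after = link_probability(pruned)
increase = relative_increase_pct(p_before, p_after)
```

In this toy setting the advertised set has four members, the heuristic removes one, and the adversary's chance of a correct link rises accordingly. Combining several heuristics amounts to chaining further pruning steps on the same candidate set.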
State-of-the-art mixers are moreover tightly intertwined with the growing
DeFi ecosystem by offering "anonymity mining" (AM) incentives, i.e., mixer
users receive monetary rewards for mixing coins. However, contrary to the
claims of related work, we find that AM does not always contribute to improving
the quality of a mixer's anonymity set, because AM tends to attract
privacy-ignorant users who naively reuse addresses.
SoK: Decentralized Finance (DeFi) Attacks
Within just four years, the blockchain-based Decentralized Finance (DeFi)
ecosystem has accumulated a peak total value locked (TVL) of more than 253
billion USD. This surge in DeFi's popularity has, unfortunately, been
accompanied by many impactful incidents. According to our data, users,
liquidity providers, speculators, and protocol operators suffered a total loss
of at least 3.24 billion USD from Apr 30, 2018 to Apr 30, 2022. Given the
blockchain's transparency and increasing incident frequency, two questions
arise: How can we systematically measure, evaluate, and compare DeFi incidents?
How can we learn from past attacks to strengthen DeFi security?
In this paper, we introduce a common reference frame to systematically
evaluate and compare DeFi incidents, including both attacks and accidents. We
investigate 77 academic papers, 30 audit reports, and 181 real-world incidents.
Our data reveals several gaps between academia and the practitioners'
community. For example, few academic papers address "price oracle attacks" and
"permissionless interactions", while our data suggests that they are the two
most frequent incident types (15% and 10.5%, respectively). We also
investigate potential defenses, and find that: (i) 103 (56%) of the attacks are
not executed atomically, granting a rescue time frame for defenders; (ii) SoTA
bytecode similarity analysis can at least detect 31 vulnerable/23 adversarial
contracts; and (iii) 33 (15.3%) of the adversaries leak potentially
identifiable information by interacting with centralized exchanges.
zk-Bench: A Toolset for Comparative Evaluation and Performance Benchmarking of SNARKs
Zero-Knowledge Proofs (ZKPs), especially Succinct Non-interactive ARguments of Knowledge (SNARKs), have garnered significant attention in modern cryptographic applications. Given the multitude of emerging tools and libraries, assessing their strengths and weaknesses is nuanced and time-consuming. Often, claimed results are generated in isolation, and omissions in details render them irreproducible. The lack of comprehensive benchmarks, guidelines, and support frameworks to navigate the ZKP landscape effectively is a major barrier in the development of ZKP applications.
In response to this need, we introduce zk-Bench, the first benchmarking framework and estimator tool designed for performance evaluation of public-key cryptography, with a specific focus on practical assessment of general-purpose ZKP systems. To simplify navigating the complex set of metrics and qualitative properties, we offer a comprehensive open-source evaluation platform, which enables the rigorous dissection and analysis of tools for ZKP development to uncover their trade-offs throughout the entire development stack, from low-level arithmetic libraries to high-level tools for SNARK development.
Using zk-Bench, we (i) collect data across different elliptic curves implemented across libraries, (ii) evaluate tools for ZKP development and (iii) provide a tool for estimating cryptographic protocols, instantiated for the proof system, achieving an accuracy of 6 − 32% for ZKP circuits with up to millions of gates. By evaluating zk-Bench for various hardware configurations, we find that certain tools for ZKP development favor compute-optimized hardware, while others benefit from memory-optimized hardware. We observed performance enhancements of up to % for memory-optimized configurations and % for compute-optimized configurations, contingent on the specific ZKP development tool utilized
Artifact for "API-driven Program Synthesis for Testing Static Typing Implementations"
This is the artifact for the POPL'24 paper titled "API-driven Program Synthesis for Testing Static Typing Implementations".
Finding typing compiler bugs
We propose a testing framework for validating static typing procedures in compilers. Our core component is a program generator suitably crafted for producing programs that are likely to trigger typing compiler bugs. One of our main contributions is that our program generator gives rise to transformation-based compiler testing for finding typing bugs. We present two novel approaches (type erasure mutation and type overwriting mutation) that apply targeted transformations to an input program to reveal type inference and soundness compiler bugs, respectively. Both approaches are guided by an intra-procedural type inference analysis used to capture type information flow. We implement our techniques as a tool, which we call Hephaestus. The extensibility of Hephaestus enables us to test the compilers of three popular JVM languages: Java, Kotlin, and Groovy. Within nine months of testing, we have found 156 bugs (137 confirmed and 85 fixed) with diverse manifestations and root causes in all the examined compilers. Most of the discovered bugs lie in the heart of many critical components related to static typing, such as type inference.
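The idea behind a type erasure mutation can be illustrated with a loose analog written against Python's standard `ast` module. This sketch is not Hephaestus and does not reflect its implementation: it only drops the declared type of a local variable when an initializer is present, which is the kind of declaration a type inference engine would then have to reconstruct. The class and function names are hypothetical.

```python
import ast


class EraseDeclaredTypes(ast.NodeTransformer):
    """Rewrite `name: T = expr` into `name = expr`, leaving everything else intact."""

    def visit_AnnAssign(self, node):
        # Keep declarations without an initializer or with a non-trivial target:
        # erasing those would remove the only available type information.
        if node.value is None or not isinstance(node.target, ast.Name):
            return node
        erased = ast.Assign(targets=[node.target], value=node.value, type_comment=None)
        return ast.copy_location(erased, node)


def erase_types(source):
    """Return the source text with the declared types of initialized variables erased."""
    tree = EraseDeclaredTypes().visit(ast.parse(source))
    ast.fix_missing_locations(tree)
    return ast.unparse(tree)  # requires Python 3.9 or newer


original = (
    "def area(w: float, h: float) -> float:\n"
    "    result: float = w * h\n"
    "    return result\n"
)
mutated = erase_types(original)
```

The mutated program keeps the same behavior while carrying less explicit type information. In a statically typed language, a compiler that accepts the original but rejects the mutant (or infers a different type) would be a candidate for a type inference bug.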
Well-Typed Programs Can Go Wrong: A Study of Typing-Related Bugs in JVM Compilers
Despite the substantial progress in compiler testing, research endeavors
have mainly focused on detecting compiler crashes and subtle
miscompilations caused by bugs in the implementation of compiler
optimizations. Surprisingly, this growing body of work neglects other
compiler components, most notably the front-end. In statically-typed
programming languages with rich and expressive type systems and modern
features, such as type inference or a mix of object-oriented with
functional programming features, the process of static typing in
compiler front-ends is complicated by a high density of bugs. Such bugs
can lead to the acceptance of incorrect programs (breaking code
portability or the type system's soundness), the rejection of correct
(e.g. well-typed) programs, and the reporting of misleading errors and
warnings.
We conduct what is, to the best of our knowledge, the first empirical
study for understanding and characterizing typing-related compiler bugs.
To do so, we manually study 320 typing-related bugs (along with their
fixes and test cases) that are randomly sampled from four mainstream JVM
languages, namely Java, Scala, Kotlin, and Groovy. We evaluate each bug
in terms of several aspects, including their symptom, root cause, bug
fix's size, and the characteristics of the bug-revealing test cases.
Some representative observations indicate that: (1) more than half of
the typing-related bugs manifest as unexpected compile-time errors: the
buggy compiler wrongly rejects semantically correct programs, (2) the
majority of typing-related bugs lie in the implementations of the
underlying type systems and in other core components related to
operations on types, (3) parametric polymorphism is the most pervasive
feature in the corresponding test cases, (4) one third of typing-related
bugs are triggered by non-compilable programs.
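The symptoms described above suggest a simple test oracle: compare what the compiler did with what the program's typing status says it should have done. The sketch below is an assumption-laden illustration, not the study's classification scheme; the verdict names and the three outcome strings are hypothetical.

```python
from enum import Enum


class Verdict(Enum):
    OK = "compiler behaved as expected"
    UNEXPECTED_COMPILE_TIME_ERROR = "well-typed program was rejected"
    UNEXPECTED_ACCEPTANCE = "ill-typed program was accepted"
    CRASH = "compiler crashed"


def classify(expected_well_typed, outcome):
    """Map an observed compiler outcome ("accepted", "rejected", or "crashed")
    to a verdict, given whether the test program is known to be well-typed."""
    if outcome == "crashed":
        return Verdict.CRASH
    if outcome not in ("accepted", "rejected"):
        raise ValueError(f"unknown compiler outcome: {outcome!r}")
    accepted = outcome == "accepted"
    if expected_well_typed and not accepted:
        return Verdict.UNEXPECTED_COMPILE_TIME_ERROR
    if not expected_well_typed and accepted:
        return Verdict.UNEXPECTED_ACCEPTANCE
    return Verdict.OK
```

An oracle of this shape needs both well-typed and ill-typed inputs, which is consistent with the observation that a substantial share of the studied bugs are triggered by non-compilable programs.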
We believe that our study opens up a new research direction by driving
future researchers to build appropriate methods and techniques for a
more holistic testing of compilers.